Big Data Europe
Abstract
The BigDataEurope (BDE) project is developing exactly the kind of computing infrastructure that European stakeholders need when handling large volumes of data in a variety of formats; the results are open-source and their use is completely free. Coordinated by Fraunhofer IAIS, BDE is working directly with partners that represent the seven Societal Challenges identified by the European Commission (Health, Food, Energy, Transport, Climate, Social Sciences and Security). For each community, a pilot that makes use of BDE’s technology stack to address the Big Data needs identified by these challenges is well under way.

1 THE BIG DATA INTEGRATOR PLATFORM

BDE’s Integrator Platform (BDI) makes the processing of big data simpler, cheaper and more flexible than ever before. It offers basic building blocks to get started with common big data technologies and makes integration of different technologies and applications easy. Components such as Apache Spark, Hadoop HDFS, Apache Flink, Apache Flume and Apache Kafka can be built into a pipeline through a simple graphical UI. Those components can help handle the velocity and volume dimensions, but BDI is also leading the way in tackling that third big data problem: variety. This is done through BDI’s Semantic Data Lake and components like SANSA [1], which performs analytics on semantically structured RDF data by providing out-of-the-box scalable algorithms for massive datasets.

BDI is an open source platform based on Docker, today’s virtualisation technique of choice. It works on a local machine or on hundreds of nodes using Docker Swarm, and can run in-house or within an external cloud environment (not provided by BDE). BDE applications are provided as Docker containers, making their installation and set-up a 10-minute job.
With the help of the latest Docker features, BDI offers:
• Swarm-based networking
• Load Balancing
• Service Discovery
• Multi-host networking with integrated KV-Store
• Fault tolerance

Docker Compose helps to create multiple containers on multiple nodes using a single command and a single compose file. Docker Compose V2 and Docker Swarm aim to implement full integration, which means that it is feasible to point a Compose app at a Swarm cluster and use it in the same manner as if a single Docker host were being used. It is notable that the latest Docker components more closely resemble Kubernetes in terms of orchestration features, and Swarm presents a better choice in terms of shifting from a local/development environment to a cluster.

[1] http://sansa-stack.net/

The BDE Team provides baseline Docker images for Apache Hadoop, Spark, Flink and many others. Components were selected based on the requirements gathered from the seven Societal Challenges. Thus, the Platform makes it feasible to perform a variety of big data tasks, including message passing (Kafka, Flume) and storage (Hive, Cassandra). The platform is able to handle RDF triples at scale using components like FOX, SemaGrow and 4Store, with particular emphasis on the triplification of geospatial data using GeoTriples, Sextant and Strabon.

BDI has enriched the Docker platform, a high-level depiction of which is shown in Figure 1, with a layer of supporting services that help in the setup, maintenance and monitoring of the pipeline and workflows:
• The Init daemon allows users to define workflows by monitoring the start-up status of inter-dependent Docker components.
• The Pipeline Service and Builder are developed to support the creation of workflows.
• The Pipeline Monitor front-end displays the current status of the Docker components.
• The Integrator UI integrates the different official Web UIs of select pipeline components under one integrated and personalised view.
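As a rough illustration of the start-up coordination described for the Init daemon, the sketch below orders inter-dependent components so that each one starts only after the components it depends on. It is not BDE's implementation: the component names, the `DEPENDENCIES` map and the `startup_order` function are all hypothetical, and the ordering relies on Python's standard `graphlib` module (Python 3.9 or later).

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical dependency map (illustrative names, not taken from BDE):
# each component lists the components that must be up before it starts.
DEPENDENCIES = {
    "hdfs-namenode": set(),
    "hdfs-datanode": {"hdfs-namenode"},
    "spark-master": {"hdfs-namenode"},
    "spark-worker": {"spark-master"},
    "analytics-job": {"spark-worker", "hdfs-datanode"},
}


def startup_order(dependencies):
    """Return the components in an order that respects every dependency."""
    try:
        # static_order() yields each node only after all of its predecessors.
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as err:
        # err.args[1] holds the nodes that form the cycle.
        raise ValueError(f"circular dependency: {err.args[1]}") from err


if __name__ == "__main__":
    for step, name in enumerate(startup_order(DEPENDENCIES), start=1):
        print(f"{step}. start {name}")
```

Several valid orders can exist for the same map, so the sketch only guarantees that dependencies come first. A real daemon would additionally have to poll each container's status before releasing the next step.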
Furthermore, the Swarm UI visualises the status of a swarm cluster and allows users to scale and monitor the cluster services.

Figure 1: BDI platform’s high-level modular architecture

For BDI platform progress updates, please refer to the dedicated page, or try the platform out and engage with our community.